Bimodal Distribution and Co-Bursting in Review Spam Detection
نویسندگان
چکیده
Online reviews play a crucial role in helping consumers evaluate and compare products and services. This critical importance of reviews also incentivizes fraudsters (or spammers) to write fake or spam reviews to secretly promote or demote some target products and services. Existing approaches to detecting spam reviews and reviewers employed review contents, reviewer behaviors, star rating patterns, and reviewer-product networks for detection. In this research, we further discovered that reviewers’ posting rates (number of reviews written in a period of time) also follow an interesting distribution pattern, which has not been reported before. That is, their posting rates are bimodal. Multiple spammers also tend to collectively and actively post reviews to the same set of products within a short time frame, which we call co-bursting. Furthermore, we found some other interesting patterns in individual reviewers’ temporal dynamics and their co-bursting behaviors with other reviewers. Inspired by these findings, we first propose a two-mode Labeled Hidden Markov Model to model spamming using only individual reviewers’ review posting times. We then extend it to the Coupled Hidden Markov Model to capture both reviewer posting behaviors and co-bursting signals. Our experiments show that the proposed model significantly outperforms state-of-the-art baselines in identifying individual spammers. Furthermore, we propose a cobursting network based on co-bursting relations, which helps detect groups of spammers more effectively than existing approaches.
منابع مشابه
Modeling Review Spam Using Temporal Patterns and Co-bursting Behaviors
Online reviews play a crucial role in helping consumers evaluate and compare products and services. However, review hosting sites are often targeted by opinion spamming. In recent years, many such sites have put a great deal of effort in building effective review filtering systems to detect fake reviews and to block malicious accounts. Thus, fraudsters or spammers now turn to compromise, purcha...
متن کاملAn Effective Model for SMS Spam Detection Using Content-based Features and Averaged Neural Network
In recent years, there has been considerable interest among people to use short message service (SMS) as one of the essential and straightforward communications services on mobile devices. The increased popularity of this service also increased the number of mobile devices attacks such as SMS spam messages. SMS spam messages constitute a real problem to mobile subscribers; this worries telecomm...
متن کاملA Novel Hybrid Approach for Email Spam Detection based on Scatter Search Algorithm and K-Nearest Neighbors
Because cyberspace and Internet predominate in the life of users, in addition to business opportunities and time reductions, threats like information theft, penetration into systems, etc. are included in the field of hardware and software. Security is the top priority to prevent a cyber-attack that users should initially be detecting the type of attacks because virtual environments are not moni...
متن کاملA New Hybrid Approach of K-Nearest Neighbors Algorithm with Particle Swarm Optimization for E-Mail Spam Detection
Emails are one of the fastest economic communications. Increasing email users has caused the increase of spam in recent years. As we know, spam not only damages user’s profits, time-consuming and bandwidth, but also has become as a risk to efficiency, reliability, and security of a network. Spam developers are always trying to find ways to escape the existing filters therefore new filters to de...
متن کاملCoSpa: A Co-training Approach for Spam Review Identification with Support Vector Machine
Spam reviews are increasingly appearing on the Internet to promote sales or defame competitors by misleading consumers with deceptive opinions. This paper proposes a co-training approach called CoSpa (Co-training for Spam review identification) to identify spam reviews by two views: one is the lexical terms derived from the textual content of the reviews and the other is the PCFG (Probabilistic...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2017